Optimization of Cost Function Weights for Unit Selection Speech Synthesis Using Speech Recognition
Abstract
A well-known problem in unit selection speech synthesis is designing the sub-costs of the join and target functions and optimizing their corresponding weights so that they reflect human listeners' preferences. To address this, we propose a procedure that uses an objective criterion for optimal speech unit selection. The objective criterion for tuning the cost function weights is based on automatic speech recognition results. To demonstrate the effectiveness of the proposed method, listening tests with 31 naive listeners were performed. The experimental results show that the proposed method improves speech quality and intelligibility. To evaluate the quality of the synthesized speech, the unit selection speech synthesis system is compared with two other Croatian speech synthesis systems whose voices were built from the same recorded speech corpus. One of these voices was built with the Festival speech synthesis system using the statistical parametric method, and the other is a diphone concatenation based text-to-speech system. The comparison is based on subjective tests using MOS (mean opinion score) evaluation. According to the subjective tests, the system whose cost function weights were optimized with the proposed method performs better than the other two systems.
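The weight-tuning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how the weights are searched, so an exhaustive grid search is assumed here, and the names `weighted_cost`, `select_sequence`, `tune_weights`, and the `asr_score` field are hypothetical. In a real system the recognition-based score would come from running a speech recognizer on the synthesized audio; here it is a precomputed toy number attached to each candidate unit sequence.

```python
from itertools import product


def weighted_cost(subcosts, weights):
    """Total cost of a candidate as a weighted sum of its sub-costs."""
    return sum(w * c for w, c in zip(weights, subcosts))


def select_sequence(candidates, weights):
    """Pick the candidate unit sequence with the lowest weighted cost."""
    return min(candidates, key=lambda cand: weighted_cost(cand["subcosts"], weights))


def tune_weights(utterances, grid_values, n_weights):
    """Grid search (an assumption): keep the weights whose selected
    sequences get the highest mean recognition-based score."""
    best_weights, best_score = None, float("-inf")
    for weights in product(grid_values, repeat=n_weights):
        if sum(weights) == 0:
            continue  # all-zero weights cannot rank candidates
        scores = [select_sequence(u, weights)["asr_score"] for u in utterances]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_weights, best_score = weights, mean_score
    return best_weights, best_score


# Toy data: two utterances, two candidate sequences each.
# "subcosts" = (target sub-cost, join sub-cost); "asr_score" is a made-up
# stand-in for a recognizer-derived score of the synthesized candidate.
utterances = [
    [{"subcosts": (1.0, 3.0), "asr_score": 0.9},
     {"subcosts": (3.0, 1.5), "asr_score": 0.5}],
    [{"subcosts": (2.0, 4.0), "asr_score": 0.8},
     {"subcosts": (5.0, 2.0), "asr_score": 0.4}],
]

best_weights, best_score = tune_weights(utterances, (0.0, 0.5, 1.0), n_weights=2)
print(best_weights, best_score)
```

In this toy setup the better-recognized candidates have the lower target sub-cost, so the search favors weight vectors that emphasize the first sub-cost. A practical system would replace the candidate lists with a dynamic-programming search over the unit database and would likely use a finer or smarter search than a coarse grid.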
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
Perceptual cost functions for unit searching in large corpus-based text-to-speech
In large corpus-based concatenative Text-to-Speech, unit selection is critical for the quality of synthetic speech. Dynamic programming algorithms have been used for unit-searching by minimizing a total cost (1) between target specification and candidate units and (2) between candidate units for concatenation. The cost function is often a weighted sum of sub-costs, which are the costs for each ...
Optimization of Unit Selection Speech Synthesis
This paper reports on the improvement of Polish speech synthesis obtained by applying new techniques to BOSS (The Bonn Open Synthesis System) for Polish. In order to enhance the system's performance a variety of set-ups for the cost function, types of units used for concatenation (uniform vs. non-uniform unit selection) and the corpus alignment were tested. Three configurations for segment dura...
Discriminative weight training for unit-selection based speech synthesis
Concatenative speech synthesis by selecting units from a large database has become popular due to the high quality of its synthesized speech. The units are selected by minimizing the combination of target and join costs for a given sentence. In this paper, we propose a new approach to train the weight parameters associated with the cost functions used for unit selection in concatenative speech synthe...
On the design of cost functions for unit-selection speech synthesis
The quality of the synthetic speech provided by concatenative speech systems depends heavily on the capability of accurately modeling the different characteristics of speech segments. Moreover, the relative significance or weighting of each feature in the unit selection process is a key point in the relationship between synthetic speech and human perception. In this paper we propose a new metho...
Publication date: 2014